85 research outputs found

Unrestricted Bridging Resolution

Author: Hou Yufang
Publication date: 01/01/2016

Anaphora plays a major role in discourse comprehension and accounts for the coherence of a text. In contrast to identity anaphora, which indicates that a noun phrase refers back to the same entity introduced by previous descriptions in the discourse, bridging anaphora or associative anaphora links anaphors and antecedents via lexico-semantic, frame or encyclopedic relations. In recent years, various computational approaches have been developed for bridging resolution. However, most of them only consider antecedent selection, assuming that bridging anaphora recognition has been performed. Moreover, they often focus on subproblems, e.g., only part-of bridging or definite noun phrase anaphora. This thesis addresses the problem of unrestricted bridging resolution, i.e., recognizing bridging anaphora and finding links to antecedents where bridging anaphors are not limited to definite noun phrases and semantic relations between anaphors and their antecedents are not restricted to meronymic relations. In this thesis, we solve the problem using a two-stage statistical model. Given all mentions in a document, the first stage predicts bridging anaphors by exploring a cascading collective classification model. We cast bridging anaphora recognition as a subtask of learning fine-grained information status (IS). Each mention in a text gets assigned one IS class, bridging being one possible class. The model combines the binary classifiers for minority categories and a collective classifier for all categories in a cascaded way. It addresses the multi-class imbalance problem (e.g., the wide variation of bridging anaphora and their relative rarity compared to many other IS classes) within a multi-class setting while still keeping the strength of the collective classifier by investigating relational autocorrelation among several IS classes. The second stage finds the antecedents for all predicted bridging anaphors at the same time by exploring a joint inference model.
The approach models two mutually supportive tasks (i.e., bridging anaphora resolution and sibling anaphor clustering) jointly, on the basis of the observation that semantically/syntactically related anaphors are likely to be sibling anaphors, and hence share the same antecedent. Both components are based on rich linguistically-motivated features and discriminatively trained on a corpus (ISNotes) where bridging is reliably annotated. Our approaches achieve substantial improvements over the reimplementations of previous systems for all three tasks, i.e., bridging anaphora recognition, bridging anaphora resolution and full bridging resolution. The work is, to our knowledge, the first bridging resolution system that handles the unrestricted phenomenon in a realistic setting. The methods in this dissertation were originally presented in Markert et al. (2012) and Hou et al. (2013a; 2013b; 2014). The thesis gives a detailed exposition, carrying out a thorough corpus analysis of bridging and conducting a detailed comparison of our models to others in the literature, and also presents several extensions of the aforementioned papers.
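
The cascaded first stage described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not specify how the cascade is ordered, so the rule used here (a firing minority-class binary classifier overrides the collective classifier's label) is an assumption, and `cascade_is_labels`, `toy_bridging`, and `toy_collective` are hypothetical names with toy stand-in logic in place of trained models.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Mention = Dict[str, object]  # hypothetical feature dictionary for one mention
BinaryClassifier = Callable[[Mention], bool]
CollectiveClassifier = Callable[[Sequence[Mention]], List[str]]


def cascade_is_labels(
    mentions: Sequence[Mention],
    minority_classifiers: Sequence[Tuple[str, BinaryClassifier]],
    collective_classifier: CollectiveClassifier,
) -> List[str]:
    """Assign one information status (IS) class per mention.

    Assumed cascade: the collective classifier labels every mention of the
    document jointly, then the first minority-class binary classifier that
    fires on a mention overrides that mention's label.
    """
    labels = list(collective_classifier(mentions))
    for i, mention in enumerate(mentions):
        for is_class, fires in minority_classifiers:
            if fires(mention):
                labels[i] = is_class
                break  # earlier classifiers in the list take precedence
    return labels


# Toy stand-ins for trained models (assumptions for illustration only).
def toy_bridging(mention: Mention) -> bool:
    # Pretend that bare relational nouns such as "door" are bridging anaphors.
    return mention.get("head") in {"door", "roof"}


def toy_collective(mentions: Sequence[Mention]) -> List[str]:
    # Pretend that pronouns are "old" and everything else is "new".
    return ["old" if m.get("is_pronoun") else "new" for m in mentions]


example_mentions = [
    {"head": "house", "is_pronoun": False},
    {"head": "door", "is_pronoun": False},
    {"head": "she", "is_pronoun": True},
]
example_labels = cascade_is_labels(
    example_mentions, [("bridging", toy_bridging)], toy_collective
)
```

In this toy run only the second mention is relabelled as bridging; the other two keep the labels from the collective stand-in.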

Heidelberger Dokumentenserver

Extraction and Purification of a Lectin from Red Kidney Bean and Preliminary Immune Function Studies of the Lectin and Four Chinese Herbal Polysaccharides

Author: Hou Yubao
Hou Yufang
Li Jichang
Qin Guang
Yanyan Liu
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2010

Reverse micelles were used to extract lectin from red kidney beans, and factors affecting reverse micellar systems (pH value, ionic strength and extraction time) were studied. The optimal conditions were extraction at pH 4–6, back extraction at pH 9–11, ionic strength at 0.15 M NaCl, extraction for 4–6 minutes and back extraction for 8 minutes. The reverse micellar system was compared with traditional extraction methods and demonstrated to be a time-saving method for the extraction of red kidney bean lectin. Mitogenic activity of the lectin was reasonably good compared with commercial phytohemagglutinin (extracted from Phaseolus vulgaris). Mitogenic properties of the lectin were enhanced when four Chinese herbal polysaccharides were applied concurrently, among which 50 μg/mL Astragalus mongholicus polysaccharides (APS) with 12.5 μg/mL red kidney bean lectin yielded the highest mitogenic activity and 100 mg/kg/bw APS with 12.5 mg/kg/bw red kidney bean lectin elevated mouse nonspecific immunity.

Crossref

Directory of Open Access Journals

PubMed Central

'Don't Get Too Technical with Me': A Discourse Structure-Based Framework for Science Journalism

Author: Cardenas Ronald
Hou Yufang
Wang Dakuo
Yao Bingsheng
Publication date: 23/10/2023

Science journalism refers to the task of reporting technical findings of a scientific paper as a less technical news article to the general public audience. We aim to design an automated system to support this real-world task (i.e., automatic science journalism) by 1) introducing a newly-constructed and real-world dataset (SciTechNews), with tuples of a publicly-available scientific paper, its corresponding news article, and an expert-written short summary snippet; 2) proposing a novel technical framework that integrates a paper's discourse structure with its metadata to guide generation; and, 3) demonstrating with extensive automatic and human experiments that our framework outperforms other baseline methods (e.g. Alpaca and ChatGPT) in elaborating a content plan meaningful for the target audience, simplifying the information selected, and producing a coherent final report in a layman's style. Comment: Accepted to EMNLP 202

arXiv.org e-Print Archive

Probing for Bridging Inference in Transformer Language Models

Author: Hou Yufang
Pandit Onkar
Publication venue: HAL CCSD
Publication date: 06/06/2021

We probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers prominently focus on bridging relations in comparison with the lower and middle layers; in addition, a few specific attention heads concentrate consistently on bridging. More importantly, we consider language models as a whole in our second approach, where bridging anaphora resolution is formulated as a masked token prediction task (Of-Cloze test). Our formulation produces optimistic results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Our further investigation shows that the distance between anaphor and antecedent and the context provided to language models play an important role in the inference.
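
The masked-token formulation mentioned in this abstract can be sketched roughly as follows. The abstract does not give the exact template, so the "<anaphor> of [MASK]" pattern used here is an assumption based on the name "Of-Cloze", and `of_cloze_prompt`, `rank_antecedents`, and `toy_mask_score` are hypothetical names. The scorer is a toy stand-in; in practice it would be replaced by a masked language model's score for filling the mask with each candidate.

```python
from typing import Callable, List, Sequence

MASK = "[MASK]"  # BERT-style mask token


def of_cloze_prompt(context: str, anaphor: str) -> str:
    """Build a cloze sentence whose masked slot stands for the antecedent."""
    return f"{context.strip()} {anaphor.strip()} of {MASK}."


def rank_antecedents(
    prompt: str,
    candidates: Sequence[str],
    mask_score: Callable[[str, str], float],
) -> List[str]:
    """Order candidate antecedents by the score of filling the mask with them."""
    return sorted(candidates, key=lambda c: mask_score(prompt, c), reverse=True)


def toy_mask_score(prompt: str, candidate: str) -> float:
    # Toy stand-in (assumption): count how often the candidate occurs in the
    # context. A real setup would query a masked language model instead.
    return float(prompt.lower().count(candidate.lower()))


example_prompt = of_cloze_prompt("She approached the house.", "The door")
example_ranking = rank_antecedents(
    example_prompt, ["garden", "house"], toy_mask_score
)
```

Passing the scorer in as a function keeps the prompt construction separate from whichever language model is used to fill the mask.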

INRIA a CCSD electronic archive server